Creators/Authors contains: "Yang, Jie"

  1. Free, publicly-accessible full text available November 2, 2024
  2. Abstract

    Forest trees provide critical ecosystem services for humanity that are under threat due to ongoing global change. Measuring and characterizing genetic diversity are key to understanding adaptive potential and developing strategies to mitigate negative consequences arising from climate change. In the area of forest genetic diversity, genetic divergence caused by large-scale changes at the chromosomal level has been largely understudied. In this study, we used RNA-seq data from 20 co-occurring forest tree species from genera including Acer, Alnus, Amelanchier, Betula, Cornus, Corylus, Dirca, Fraxinus, Ostrya, Populus, Prunus, Quercus, Ribes, Tilia, and Ulmus, sampled from the Upper Peninsula of Michigan. These data were used to infer the origin and maintenance of gene family variation, species divergence time, and gene family expansion and contraction. We identified a signal of common whole genome duplication events shared by core eudicots. We also found rapid evolution, namely fast expansion or fast contraction of gene families, in plant–pathogen interaction genes amongst the studied diploid species. Finally, the results lay the foundation for further research on the genetic diversity and adaptive capacity of forest trees, which will inform forest management and conservation policies.

     
  3. Categorical data analysis becomes challenging when high-dimensional sparse covariates are involved, which is often the case for omics data. We introduce a statistical procedure based on multinomial logistic regression analysis for such scenarios, including variable screening, model selection, order selection for response categories, and variable selection. We apply our procedure to high-dimensional gene expression data with 801 patients, 2426 genes, and five types of cancerous tumors. As a result, we recommend three finalized models: one with 74 genes achieves an extremely low cross-entropy loss and a zero predictive error rate under five-fold cross-validation; and two other models with 31 and 4 genes, respectively, are recommended for prognostic multi-gene signatures.

     
    Free, publicly-accessible full text available September 1, 2024
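
The two evaluation metrics named in this abstract (cross-entropy loss and predictive error rate under five-fold cross-validation) can be illustrated with a minimal sketch. The function names and the toy probabilities below are hypothetical and chosen only for illustration; this is not the authors' procedure, only one plain way to compute such metrics from predicted class probabilities.

```python
import math

def cross_entropy_loss(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    eps = 1e-12  # guard against log(0)
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)

def error_rate(probs, labels):
    """Fraction of samples whose most probable class is not the true class."""
    wrong = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted != y:
            wrong += 1
    return wrong / len(labels)

def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous folds whose sizes differ by at most one."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# Toy example (made-up probabilities for three samples and three classes).
toy_probs = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
toy_labels = [0, 1, 2]
toy_loss = cross_entropy_loss(toy_probs, toy_labels)
toy_error = error_rate(toy_probs, toy_labels)

# Fold sizes for a data set with 801 samples, as in the abstract.
fold_sizes = [len(fold) for fold in k_fold_indices(801, 5)]
```

In practice the folds would usually be shuffled and stratified by class before use; the contiguous split here only keeps the sketch short.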
  4. Thrall, Peter H. (Ed.)
    Abstract

    Metabolomics provides an unprecedented window into diverse plant secondary metabolites that represent a potentially critical niche dimension in tropical forests underlying species coexistence. Here, we used untargeted metabolomics to evaluate chemical composition of 358 tree species and its relationship with phylogeny and variation in light environment, soil nutrients, and insect herbivore leaf damage in a tropical rainforest plot. We report no phylogenetic signal in most compound classes, indicating rapid diversification in tree metabolomes. We found that locally co‐occurring species were more chemically dissimilar than random and that local chemical dispersion and metabolite diversity were associated with lower herbivory, especially that of specialist insect herbivores. Our results highlight the role of secondary metabolites in mediating plant–herbivore interactions and their potential to facilitate niche differentiation in a manner that contributes to species coexistence. Furthermore, our findings suggest that specialist herbivore pressure is an important mechanism promoting phytochemical diversity in tropical forests.

     
    Free, publicly-accessible full text available November 1, 2024
  5. Free, publicly-accessible full text available August 9, 2024
  6. Rapid and ultrasensitive point-of-care RNA detection plays a critical role in the diagnosis and management of various infectious diseases. The gold-standard detection method, reverse transcription-quantitative polymerase chain reaction (RT-qPCR), is ultrasensitive and accurate yet limited by its lengthy turnaround time (1-2 days). Antigen tests, on the other hand, offer rapid at-home detection (15-20 min) but suffer from low sensitivity and high false-negative rates. An ideal point-of-care diagnostic device would combine PCR-level sensitivity with a rapid sample-to-result workflow comparable to antigen testing. However, existing RNA detection platforms typically offer either superior sensitivity or a rapid sample-to-result time, but not both. This paper reports a point-of-care microfluidic device that offers ultrasensitive yet rapid detection of viral RNA from clinical samples. The device consists of a microfluidic chip for precisely manipulating small volumes of samples, a miniaturized heater for viral lysis and ribonuclease (RNase) inactivation, a CRISPR Cas13a-electrochemical sensor for preamplification-free, ultrasensitive RNA detection, and a smartphone-compatible potentiostat for data acquisition. As a demonstration, the device detects heat-inactivated SARS-CoV-2 samples with a limit of detection (LOD) down to 10 aM within 25 minutes, which is comparable to the sensitivity of RT-PCR and the speed of antigen tests. The platform also successfully distinguishes all nine positive unprocessed clinical SARS-CoV-2 nasopharyngeal swab samples from four negative samples within a sample-to-result time of 25 minutes. Together, this device provides a point-of-care solution that can be deployed in diverse settings beyond laboratory environments for rapid and accurate detection of RNA from clinical samples. The device could potentially be extended to detect other viral targets where rapid and ultrasensitive point-of-care detection is required, such as human immunodeficiency virus (HIV) for self-testing and Zika virus.
    Free, publicly-accessible full text available July 26, 2024
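
To put the reported limit of detection in more familiar units, the short sketch below converts an attomolar concentration into target copies per microliter using the Avogadro constant. The function name is hypothetical; only the 10 aM figure comes from the abstract.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)

def attomolar_to_copies_per_microliter(conc_am):
    """Convert a concentration in attomolar (aM) to molecules per microliter."""
    moles_per_liter = conc_am * 1e-18            # 1 aM = 1e-18 mol/L
    copies_per_liter = moles_per_liter * AVOGADRO
    return copies_per_liter / 1e6                # 1 L = 1e6 microliters

# The LOD quoted in the abstract: 10 aM is roughly 6 copies per microliter.
lod_copies_per_ul = attomolar_to_copies_per_microliter(10)
```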
  7. We propose SNNOpt, a systematic application-specific hardware design methodology for Spiking Neural Networks (SNNs), which consists of three novel phases: 1) an Ollivier-Ricci curvature (ORC)-based architecture-aware network partitioning, 2) a reinforcement learning mapping strategy, and 3) a Bayesian optimization algorithm for NoC design space exploration. Experimental results show that SNNOpt achieves 47.45% less runtime and 58.64% energy savings over state-of-the-art approaches.
    Free, publicly-accessible full text available June 11, 2024
  8. Sparse data with a high proportion of zeros arise in various disciplines. Modeling sparse high-dimensional data is a challenging and growing research area. In this paper, we provide statistical methods and tools for analyzing sparse data in a fairly general and complex context. We use two real scientific applications as illustrations: longitudinal vaginal microbiome data and high-dimensional gene expression data. We recommend zero-inflated model selection and significance tests to identify the time intervals when the pregnant and non-pregnant groups of women are significantly different in terms of Lactobacillus species. We apply the same techniques to select the best 50 genes out of 2426 in the sparse gene expression data. The classification based on our selected genes achieves 100% prediction accuracy. Furthermore, the first four principal components based on the selected genes can explain as much as 83% of the model variability.
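
As a rough illustration of what a zero-inflated model looks like, the sketch below implements the zero-inflated Poisson (ZIP) distribution, one common member of this family. This is an assumption made for illustration: the abstract does not say which zero-inflated models were compared, and the function names and toy counts are hypothetical.

```python
import math

def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson model.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

def zip_log_likelihood(counts, pi, lam):
    """Log-likelihood of independent counts under the ZIP model."""
    return sum(math.log(zip_pmf(k, pi, lam)) for k in counts)

# Toy zero-heavy counts (made up): compare a plain Poisson fit (pi = 0)
# with a zero-inflated alternative on the same data.
toy_counts = [0, 0, 0, 0, 0, 0, 1, 2, 3, 0]
ll_poisson = zip_log_likelihood(toy_counts, pi=0.0, lam=0.6)
ll_zip = zip_log_likelihood(toy_counts, pi=0.5, lam=1.5)
```

Setting `pi = 0` recovers the ordinary Poisson model, so the two log-likelihoods can be compared directly, for example through an information criterion that penalizes the extra parameter.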

     